home *** CD-ROM | disk | FTP | other *** search
- CONVERT.DOC
- 02/01/94
-
- The CONVERT.EXE program converts data between several basic data formats.
- Features:
-
- * You can read data in from any of the following data formats:
- ASCII-delimited
- fixed field
- dBase-compatible
- * You can write data out in any of the following data formats:
- ASCII-delimited
- fixed field
- dBase-compatible
- WKS (Lotus 1-2-3 release 1 compatible)
- * For ASCII-delimited files, you can specify the delimiters used between
- fields as well as around numeric or character data
- * For dBase input files, you can retain deleted records if you want
- * You can resize variables or drop them entirely if desired
- * You can specify up to 10 include filters; all records processed must meet
- at least one of these filter conditions
- * You can specify up to 10 exclude filters; any records which meet these
- filter conditions are dropped
- * For WKS output files, you can process input files bigger than Lotus
- 1-2-3 itself can handle (1-2-3 limits input records to being 240 characters
- or less
-
-
- Program written by:
- Bruce Guthrie
- Room H-4885
- U.S. Dept of Commerce/ESA/OBA/BSISD
- Washington, D.C. 20230
-
- (202) 482-3234
-
- You may freely copy and re-distribute this program; however, the U.S.
- Department of Commerce neither guarantees nor assures compatibility of the
- program with all computer software or hardware.
-
- Foreign users: Please provide an Internet address in all correspondence or
- and just e-mail your problems to me at bgu@cu.nih.gov
-
-
-
- Data format types:
-
- An ASCII-delimited file is one which typically has double quotes around each
- character field (the quotes are optional in CONVERT) and typically commas
- between fields. Leading and trailing spaces are removed from character as well
- as all other values.
-
- A fixed-field file places each field in the same column positions on each
- record. The lengths of the fields are the same from record to record.
-
- As an example, these might be an ASCII-delimited records:
-
- "Economic Bulletin Board","202 482-3870",35
- "EBB High-Speed","202 482-2584",100
-
- Fixed-field file records might look like this for the same data:
-
- Economic Bulletin Board 202 482-3870 35
- EBB High-Speed 202 482-2584 100
-
- WKS files are supported directly by Lotus 1-2-3 (all versions) as well as most
- other spreadsheet programs. DBF files are supported by dBaseIII, dBaseIII+,
- dBaseIV, and most other data base management programs (Paradox etc).
-
-
-
- Field definition file:
-
- Unless you are reading a dBase file, this program requires a field definition
- file to figure out the characteristics for each field and also to set certain
- file characteristics. If you're processing an ASCII-delimited input file, the
- routine can try to create a field definition file for you if desired.
-
- The definition file can be created with any text editor. The definition file
- consists of several records with the following fields separated by spaces.
- Except for the record type indicator (which must begin in column 1), all other
- fields can be placed in any columns:
-
- (1) record type (see below)
- (2) length of field on input
- (3) number of decimal places for numeric data (if you don't know,
- put a "?" here; for non-numeric data, a "0" is fine) on output
- (4) length of field on output
-
- Any characters after the field length are treated as comment fields. You would
- typically use this to enter the field name or column position or any other
- information of use to you.
-
- The data record types accepted by this routine are as follows:
-
- type C = character data
- N = numeric
- L = logical (T or F)
- D = date (in yyyymmdd format)
-
- The data fields should be in the order the fields are found in the source file.
-
- You may also specify up to 10 include filters and up to 10 exclude filters in
- the field definition file. If an include filter is specified, the input record
- must contain at least one of the specified character strings in order to be
- processed. If an exclude filter is specified, any input record which contains
- any of the specified character strings will be ignored. Filters are case
- sensitive (capitalization matters) and processed as "or" items (if any filter is
- met, the condition is met; filters are not combined to determine fulfillment).
- The filters are specified in the field definition file in the following ways:
-
- /+=string include filter, the string "string" can appear anywhere
- /S+=string include filter, the string "string" is at the beginning
- of the record
- /+S=string include filter, the string "string" is at the end of the
- record
- /-=string exclude filter, the string "string" can appear anywhere
- /S-=string exclude filter, the string "string" is at the beginning
- of the record
- /-S=string exclude filter, the string "string" is at the end of the
- record
-
-
- For example, if you want to specify in your control file that you only want
- records that contain either "Japan" or "France" *and* you want to exclude any
- records that begin with an underscore character, you would need to include the
- following three filter statements:
-
- /+=Japan
- /+=France
- /S-=_
-
- The character string can include hexadecimal codes (in the &Hxx format) or
- decimal codes (in the \ddd format) if necessary.
-
- Note that for fixed field files, you have to account for every byte in the file.
- If you have something like this:
-
- 12345678_1_2345678_2_2345678_3 (column positions)
- APPLE X Y 12 BANANAS
-
- Even though you may think you only have five fields, the following .DEF file
- will NOT work:
-
- ; Bad .DEF file: Note does not account for blank spaces
- C 8 0 8 Fruit1
- C 1 0 1 Class1
- C 1 0 1 Class2
- N 5 0 5 Value
- C 11 0 11 Fruit2
-
- You may want the Fruit1 field to be in columns 1 through 8 and Class1 to be in
- column 10 but the routine will not know to skip column 9 so it will start
- reading Class1 beginning in column 9, Class2 beginning in column 10, etc. To
- drop the blank positions, you have to add dummy fields on input and ask for them
- to be dropped on output:
-
- ; Good .DEF file: Spaces between fields are accounted for
- C 8 0 8 Fruit1
- C 1 0 0 Filler
- C 1 0 1 Class1
- C 1 0 0 Filler
- C 1 0 1 Class2
- C 1 0 0 Filler
- N 5 0 5 Value
- C 1 0 0 Filler
- C 11 0 11 Fruit2
-
- You can also use the input field length and output field lengths to either drop
- fields using other formats (by specifying a zero length for the output field
- length) or for creating fields on output (by specifying a zero length for the
- input field length). You can also use this to expand on contract a field. For
- example, if Fruit1 is 8 characters long but you only want it to occupy 4
- characters on output (thus the field would be truncated), specify 8 for the
- input field length and 4 for the output field length.
-
- For dBase files, a deffile will automatically be created for you unless you
- specify /DEF. In that case, the routine will read the deffile and act
- appropriately. This is necessary if you want fields expanded or ignored.
-
-
- Syntax:
-
- CONVERT infile [ outfile [ deffile ] ] [ [ /DEF=deffile ] [ /DEF | /-DEF ]
- [ /OVERWRITE | /-OVERWRITE | /APPEND | /OVERASK ]
- [ /FROM FIXED | /FROM ASCII | /FROM DBF ] [ /DELETED ]
- [ /TO FIXED | /TO ASCII | /TO WKS | /TO DBF ] [ /HEADER | /-HEADER ]
- [ /DELIMS=aroundstrings,aroundnums,betweenfields ] [ /BEEP | /-BEEP ]
- [ /INMISS=val ] [ /INMISSC=val ] [ /OUTMISS=val ] [ /OUTMISSC=val ]
- [ /SKIP | /MISSING | /ABORT ] [ /Q ] [ /? | /?&H ]
-
- "infile" is the file specification for the ASCII-delimited or fixed-field file
- you want converted. You can specify a drive and path specification if
- necessary. This parameter is required.
-
- "outfile" is the file specification of the file you want to create. You can
- specify a drive and path specification if necessary. If no outfile is provided,
- the routine will presume you want the output file called the same thing as the
- infile but you want the extension to be ".FIX" (if the output file is a
- fixed-field file), ".PRN" (if the output file is ASCII-delimited), or ".WKS" (if
- the output file is to be in a WKS format).
-
- "deffile" is the file specification for the field definition file. You can
- specify a drive and path specification if necessary. If no deffile is
- specifically provided, the routine will presume it is called the same thing as
- the infile but it has the extension of ".DEF". Note that the deffile can only
- be provided if you also specify the outfile name.
-
- "/DEF=deffile" is the same thing as the "deffile" parm alone above but it's
- not position specific so you can leave out the output file name if desired.
-
- "/DEF" applies to dBase input files only. It tells CONVERT that you want to
- read the field information from an existing field-definition file instead of
- reading the definition in dBase itself. Normally, this is only done if you want
- the program to expand, condense, or eliminate some fields.
-
- "/-DEF" applies to ASCII-delimited input files only. It tells CONVERT to scan
- the first few records in the input file and to figure out the field-definition
- file for you. There are some risks in this primarily in that the routine may
- guess a field type incorrectly. This also takes awhile. The routine will
- automatically write out the field-definition file for you if you select this
- option.
-
-
- "/OVERWRITE" says to overwrite the output file if it exists already.
-
- "/-OVERWRITE" says to abort if the output file exists already.
-
- "/APPEND" says to append (add) to the output file if it exists already. This
- option is only available if you're creating either a fixed-field or ASCII-
- delimited output file.
-
- "/OVERASK" says to ask if the output file exists already. This is typically
- the default but it's CONFIGWS-able.
-
- "/FROM FIXED", "/FROM ASCII", and "/FROM DBF" specifies the format for the input
- file. The routine will typically read the file and guess its format for you.
- You can also specify a default by using the CONFIGWS.EXE program.
-
- "/DELETED" applies for dBase input files only. It says you want to retain
- deleted records. Otherwise, they're dropped in the output file.
-
- "/TO FIXED", "/TO ASCII", "/TO WKS", and "/TO DBF" tells the routine what sort
- of output file you'd like to create. Typically defaults to "/TO WKS" although
- you can change this using the CONFIGWS.EXE program.
-
- "/HEADER" is used in conjunction with WKS output files. If /HEADER is in
- effect, the first row of the spreadsheet will contain the variable name for the
- cell as provided in your control file. If none are provided, the field names
- will be FIELD_01 onward. /-HEADER turns this off and is the default. Note that
- the header line (if any) will show up in the output counts.
-
- "/DELIMS=aroundstrings,aroundnums,betweenfields" allows you to specify the
- delimiters (in sequence) around string fields, around numeric fields, and
- between fields. Defaults to:
-
- /DELIMS=",,,
-
- (Use quotes around character strings, nothing around numeric data, and the third
- comma indicates that there is a comma between fields.) The replacement string
- can include hexadecimal codes (in the &Hxx format) or decimal codes (in the \ddd
- format) if necessary so either of the following would put a tab between fields:
-
- /DELIMS=",,&H09
- /DELIMS=",,\009
-
- See the table of hexadecimal and decimal codes at the end of this documentation.
-
-
- "/BEEP" beeps when the program is finished.
-
- "/-BEEP" reverses /BEEP. This is normally the default but you can override this
- using the CONFIGWS.EXE program.
-
- "/INMISS=val" specifies that any numeric value that has the character string
- representation of "val" will be considered missing. Note that this is an exact
- character string comparison so /INMISS=1 will not compare to a value of "1.00".
- Defaults to /INMISS=NULL (which translates as spaces).
-
- "/INMISSC=val" specifies that any character string value that has the value of
- "val" will be considered missing. Defaults to /INMISSC= (which shouldn't
- happen very much).
-
- "/OUTMISS=val" specifies that any missing numeric input value will be translated
- on output as "val". For example, "/OUTMISS=N.A." would fill in "N.A." for each
- missing value. Defaults to /OUTMISS=NULL (which translates as spaces).
-
- "/OUTMISSC=val" specifies that any missing character input value will be
- translated on output as "val". Defaults to /OUTMISSC=NULL (which translates as
- spaces).
-
- "/SKIP" says to skip records with bad data values; otherwise the routine
- aborts when it runs into any. /SKIP, /MISSING, and /ABORT are mutually
- exclusive.
-
- "/MISSING" says to presume any missing fields in an ASCII-delimited record
- should be filled in with blanks (for character fields) and 0 for numeric fields.
- Incomplete records are written out (unlike in /SKIP). /SKIP, /MISSING, and
- /ABORT are mutually exclusive. Note that the program will only print out the
- first "bad" record. There may be others that show up after this one.
-
- "/ABORT" says to abort when you run into bad records. This is typically the
- default but you can change this using the CONFIGWS.EXE program. /SKIP,
- /MISSING, and /ABORT are mutually exclusive.
-
- "/Q" turns off the line-by-line status report.
-
- "/?" (or "/HELP" or "HELP" ) gives you the syntax for the command.
-
- "/?&H" gives you a hexadecimal and decimal conversion table.
-
-
-
- Return codes:
-
- CONVERT returns the following ERRORLEVEL codes:
- 0 = no problems, file converted
- 254 = problems for formats of data
- 255 = syntax problems, file(s) not found, or /? requested
-
-
- Restrictions and Caveats:
-
- The program cannot handle dBase files with Memo fields.
-
- Most spreadsheet programs restrict a given field length to being 240 characters
- or less.
-
- Date fields which are in the form "yy-mm-dd" (instead of "yyyymmdd") should be
- declared as character fields instead of date fields.
-
- Date fields converted for WKS files are changed into strings of the "yy-mm-dd"
- format.
-
-
-
- Decimal and hexadecimal codes:
- e.g. "\066\097\116" and "&H426174" both are "Bat"
- +---------------------------------------------------------------------------
- | dec hex chr | dec hex chr | dec hex chr | dec hex chr | dec hex chr |
- +--------------+--------------+--------------+--------------+--------------+
- | \000 &H00 nul| \052 &H34 4 | \104 &H68 h | \156 &H9C ú | \208 &HD0 – |
- | \001 &H01 | \053 &H35 5 | \105 &H69 i | \157 &H9D ù | \209 &HD1 — |
- | \002 &H02 | \054 &H36 6 | \106 &H6A j | \158 &H9E û | \210 &HD2 “ |
- | \003 &H03 | \055 &H37 7 | \107 &H6B k | \159 &H9F ü | \211 &HD3 ” |
- | \004 &H04 | \056 &H38 8 | \108 &H6C l | \160 &HA0 † | \212 &HD4 ‘ |
- | \005 &H05 | \057 &H39 9 | \109 &H6D m | \161 &HA1 ° | \213 &HD5 ’ |
- | \006 &H06 | \058 &H3A : | \110 &H6E n | \162 &HA2 ¢ | \214 &HD6 ÷ |
- | \007 &H07 bel| \059 &H3B ; | \111 &H6F o | \163 &HA3 £ | \215 &HD7 ◊ |
- | \008 &H08 bs | \060 &H3C < | \112 &H70 p | \164 &HA4 § | \216 &HD8 ÿ |
- | \009 &H09 tab| \061 &H3D = | \113 &H71 q | \165 &HA5 • | \217 &HD9 Ÿ |
- | \010 &H0A lf | \062 &H3E > | \114 &H72 r | \166 &HA6 ¶ | \218 &HDA ⁄ |
- | \011 &H0B vt | \063 &H3F ? | \115 &H73 s | \167 &HA7 ß | \219 &HDB € |
- | \012 &H0C pg | \064 &H40 @ | \116 &H74 t | \168 &HA8 ® | \220 &HDC ‹ |
- | \013 &H0D cr | \065 &H41 A | \117 &H75 u | \169 &HA9 © | \221 &HDD › |
- | \014 &H0E | \066 &H42 B | \118 &H76 v | \170 &HAA ™ | \222 &HDE fi |
- | \015 &H0F | \067 &H43 C | \119 &H77 w | \171 &HAB ´ | \223 &HDF fl |
- | \016 &H10 | \068 &H44 D | \120 &H78 x | \172 &HAC ¨ | \224 &HE0 ‡ |
- | \017 &H11 | \069 &H45 E | \121 &H79 y | \173 &HAD ≠ | \225 &HE1 · |
- | \018 &H12 | \070 &H46 F | \122 &H7A z | \174 &HAE Æ | \226 &HE2 ‚ |
- | \019 &H13 | \071 &H47 G | \123 &H7B { | \175 &HAF Ø | \227 &HE3 „ |
- | \020 &H14 | \072 &H48 H | \124 &H7C | | \176 &HB0 ∞ | \228 &HE4 ‰ |
- | \021 &H15 | \073 &H49 I | \125 &H7D } | \177 &HB1 ± | \229 &HE5 Â |
- | \022 &H16 | \074 &H4A J | \126 &H7E ~ | \178 &HB2 ≤ | \230 &HE6 Ê |
- | \023 &H17 | \075 &H4B K | \127 &H7F | \179 &HB3 ≥ | \231 &HE7 Á |
- | \024 &H18 | \076 &H4C L | \128 &H80 Ä | \180 &HB4 ¥ | \232 &HE8 Ë |
- | \025 &H19 | \077 &H4D M | \129 &H81 Å | \181 &HB5 µ | \233 &HE9 È |
- | \026 &H1A eof| \078 &H4E N | \130 &H82 Ç | \182 &HB6 ∂ | \234 &HEA Í |
- | \027 &H1B esc| \079 &H4F O | \131 &H83 É | \183 &HB7 ∑ | \235 &HEB Î |
- | \028 &H1C | \080 &H50 P | \132 &H84 Ñ | \184 &HB8 ∏ | \236 &HEC Ï |
- | \029 &H1D ???| \081 &H51 Q | \133 &H85 Ö | \185 &HB9 π | \237 &HED Ì |
- | \030 &H1E ???| \082 &H52 R | \134 &H86 Ü | \186 &HBA ∫ | \238 &HEE Ó |
- | \031 &H1F ???| \083 &H53 S | \135 &H87 á | \187 &HBB ª | \239 &HEF Ô |
- | \032 &H20 | \084 &H54 T | \136 &H88 à | \188 &HBC º | \240 &HF0 |
- | \033 &H21 ! | \085 &H55 U | \137 &H89 â | \189 &HBD Ω | \241 &HF1 Ò |
- | \034 &H22 " | \086 &H56 V | \138 &H8A ä | \190 &HBE æ | \242 &HF2 Ú |
- | \035 &H23 # | \087 &H57 W | \139 &H8B ã | \191 &HBF ø | \243 &HF3 Û |
- | \036 &H24 $ | \088 &H58 X | \140 &H8C å | \192 &HC0 ¿ | \244 &HF4 Ù |
- | \037 &H25 % | \089 &H59 Y | \141 &H8D ç | \193 &HC1 ¡ | \245 &HF5 ı |
- | \038 &H26 & | \090 &H5A Z | \142 &H8E é | \194 &HC2 ¬ | \246 &HF6 ˆ |
- | \039 &H27 ' | \091 &H5B [ | \143 &H8F è | \195 &HC3 √ | \247 &HF7 ˜ |
- | \040 &H28 ( | \092 &H5C \ | \144 &H90 ê | \196 &HC4 ƒ | \248 &HF8 ¯ |
- | \041 &H29 ) | \093 &H5D ] | \145 &H91 ë | \197 &HC5 ≈ | \249 &HF9 ˘ |
- | \042 &H2A * | \094 &H5E ^ | \146 &H92 í | \198 &HC6 Δ | \250 &HFA ˙ |
- | \043 &H2B + | \095 &H5F _ | \147 &H93 ì | \199 &HC7 « | \251 &HFB ˚ |
- | \044 &H2C , | \096 &H60 ` | \148 &H94 î | \200 &HC8 » | \252 &HFC ¸ |
- | \045 &H2D - | \097 &H61 a | \149 &H95 ï | \201 &HC9 … | \253 &HFD ˝ |
- | \046 &H2E . | \098 &H62 b | \150 &H96 ñ | \202 &HCA | \254 &HFE ˛ |
- | \047 &H2F / | \099 &H63 c | \151 &H97 ó | \203 &HCB À | \255 &HFF ˇ |
- | \048 &H30 0 | \100 &H64 d | \152 &H98 ò | \204 &HCC Ã | |
- | \049 &H31 1 | \101 &H65 e | \153 &H99 ô | \205 &HCD Õ | |
- | \050 &H32 2 | \102 &H66 f | \154 &H9A ö | \206 &HCE Œ | |
- | \051 &H33 3 | \103 &H67 g | \155 &H9B õ | \207 &HCF œ | |
- +--------------+--------------+--------------+--------------+--------------+